Automatic Segmentation of Separately Pronounced Sinhala Words into Syllables
Abstract
Aligned corpora are widely used in speech applications such as automatic speech recognition and speech synthesis, as well as in prosodic and phonetic research. Segmentation into syllables can be done manually or automatically. Fully manual phonetic segmentation, however, takes significantly more time and is complicated in practice, because in many cases a large aligned speech corpus is required. When manual syllabification is carried out by a group of individuals, consistency also decreases because of variation in how each individual analyses the signal. Consequently, there is a pressing need for automatic syllabification, which is particularly important because the Sinhala language is syllable-centric in nature. This paper presents a method for syllabifying the acoustic signals of separately pronounced Sinhala words. Syllable boundaries are detected in two main phases, each of which is described with examples.
Similar resources
A Rule Based Syllabification Algorithm for Sinhala
This paper presents a study of Sinhala syllable structure and an algorithm for identifying syllables in Sinhala words. After a thorough study of the Syllable structure and linguistic rules for syllabification of Sinhala words and a survey of the relevant literature, a set of rules was identified and implemented as a simple, easy-to-implement algorithm. The algorithm was tested using 30,000 dist...
“A Review: Different methods of segmenting a continuous speech signal into basic units”
Speech is the medium through which human beings can communicate. Segmentation of speech is required for better speech recognition. Segmentation of speech can be done into basic units like words, phonemes or syllables. The two main methods used for segmentation of speech signals are manual segmentation and automatic segmentation. But manual segmentation is not favoured as it is tedious, time con...
Word segmentation in Persian continuous speech using F0 contour
Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...
Defining the Gold Standard Definitions for the Morphology of Sinhala Words
In this work, we describe the steps and strategies we followed in defining morpheme segmentation boundaries of Sinhala words (which we call Gold Standard Definitions). We measured the coverage of the defined resource against three different Sinhala corpora and obtained over 70% coverage for each corpus. Then we report some interesting facts and findings about the Sinhala language revealed...
Semi-Automatic Segmentation System for Syllables Extraction from Continuous Arabic Audio Signal
The paper describes a speaker-independent segmentation system for breaking uttered Arabic sentences into their constituent syllables. The goal is to construct a database of acoustical Arabic syllables as a step towards a syllable-based Arabic speech verification/recognition system. The proposed technique segments the utterances based on maxima extraction from delta function of 1st MFC coefficient...